
Auto generated client #3

Merged
njbrake merged 6 commits into main from codegen/control-plane-preview
Jun 8, 2026

njbrake commented Jun 5, 2026

Member

Note: this PR description was drafted by Claude via back-and-forth with @njbrake. The reasoning and decisions are his; the prose is Claude's.

Summary

Rebuilds the Python SDK as "Option C": a thin hand-written shell over a typed client core (src/otari/_client) generated from the gateway's OpenAPI spec. This replaces the previous hand-maintained inference wrapper around the openai SDK. The generated core models otari's actual contract, including endpoints the OpenAI SDK cannot represent.

  • Generated typed core covers every endpoint: chat, responses, messages, embeddings, moderations, rerank, models, batches, and the control plane (keys/users/budgets/pricing/usage). It replaces the old _control_plane subpackage, which covered only the control plane.
  • Hand-written shell: ergonomic methods (completion, response, message, embedding, moderation, rerank, list_models, batch ops, control_plane), an SSE streaming shim (sync + async, since the generated client cannot stream), typed error mapping (ApiException to the typed OtariError hierarchy) applied in both auth modes, and auth-mode resolution (Otari-Key for inference vs Authorization: Bearer for platform/control-plane).
  • New: the Anthropic-shaped /messages endpoint, not previously surfaced by the SDK.
  • Drift gate: tests/unit/test_endpoint_coverage.py fails when the gateway exposes an endpoint the SDK does not account for.
  • Drops the runtime openai dependency from the inference path.
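
To illustrate the SSE streaming shim mentioned above, here is a minimal sketch of text/event-stream parsing that stops at the [DONE] sentinel. It is not the code in _streaming.py: the function name iter_sse_data is hypothetical, and it assumes every data payload other than [DONE] is JSON.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[Any]:
    """Yield the parsed JSON payload of each SSE event until [DONE].

    Hypothetical helper for illustration; assumes JSON payloads.
    """
    data_parts: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it has data.
            if data_parts:
                payload = "\n".join(data_parts)
                data_parts = []
                if payload == "[DONE]":
                    return  # sentinel: the stream is finished
                yield json.loads(payload)
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        field, _, value = line.partition(":")
        if field == "data":
            # The SSE format strips one optional space after the colon.
            data_parts.append(value[1:] if value.startswith(" ") else value)
    # An event without a terminating blank line is dropped, as in the SSE spec.
```

With httpx, the lines could come from response.iter_lines() inside a client.stream("POST", ...) block; typing each chunk (for example as ChatCompletionChunk) would happen on top of the parsed dictionaries.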

Validation

  • ruff and mypy --strict are clean; 85 unit tests pass on Python 3.11 to 3.13; 6 control-plane integration tests pass against a live gateway.
  • Live smoke test against a real gateway backed by OpenAI: sync streaming (10 chunks) and async streaming (6 chunks) were both verified as incremental, embeddings returned a 1536-dimension vector, and typed error mapping was exercised. Details are in the smoke-test comment on this PR.

Follow-ups (non-blocking)

  • Ergonomic aliases for the control-plane methods, which currently use raw generated names: #107.

Relates to #96.

njbrake and others added 6 commits June 5, 2026 17:15
Generated control-plane client (keys, users, budgets, pricing, usage) from
the otari gateway OpenAPI spec via OpenAPI Generator. Preview for review and
integration design; not yet wired into the public client.

Part of mozilla-ai/otari#96

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Full CRUD lifecycle tests for every control-plane endpoint (keys, users,
budgets, pricing, usage) driving the generated client against a real gateway
started on SQLite with a master key (no provider creds needed). Verified
22 endpoint operations pass; documents the Bearer-auth requirement and the
per-language test pattern.

Part of mozilla-ai/otari#96

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- Regenerate the control-plane client as a clean subpackage (otari._control_plane).
- Add OtariClient.control_plane: typed accessors (keys/users/budgets/pricing/usage)
  sharing one client authed with Authorization: Bearer using an admin credential
  (admin_key / GATEWAY_ADMIN_KEY / platform_token). Management endpoints require
  Bearer, not the Otari-Key inference header.
- Add admin_key param + ControlPlane facade; export ControlPlane.
- Exclude the generated subpackage from ruff/mypy; add its runtime deps.
- Integration tests now drive the public control_plane surface end-to-end
  (full CRUD per endpoint + admin-credential guard); 6 pass against a live
  gateway on SQLite.
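
As a rough sketch of the header split described above (management endpoints need Authorization: Bearer, inference uses Otari-Key), the auth-mode resolution could look like the following. The function name, the mode strings, and the error on an empty credential are assumptions for illustration; the header shapes follow the commit notes on this PR.

```python
from typing import Literal

# Hypothetical mode names; the SDK may label these differently.
AuthMode = Literal["inference", "platform", "control_plane"]


def resolve_auth_headers(mode: AuthMode, credential: str) -> dict[str, str]:
    """Return the default headers for one auth mode (illustrative only)."""
    if not credential:
        raise ValueError(f"a credential is required for {mode!r} mode")
    if mode == "inference":
        # Non-platform inference uses the Otari-Key header.
        return {"Otari-Key": f"Bearer {credential}"}
    # Platform tokens and admin/master keys both use standard Bearer auth.
    return {"Authorization": f"Bearer {credential}"}
```

A dictionary like this can be fed both into a generated client's default headers and into the requests made by a streaming shim, so both paths authenticate the same way.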

Part of mozilla-ai/otari#96

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…eway

- Add .github/workflows/ci.yml: ruff + mypy + pytest on push/PR across
  Python 3.11-3.13 (the repo previously only ran checks on release).
- Integration tests skip cleanly when no gateway is on PATH (set
  OTARI_GATEWAY_CMD to run them), so CI is green without a gateway and
  passes with one.
- Register the 'integration' marker; per-file-ignore the subprocess/URL
  audit rules for the test harness.
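
One way to get the "skip cleanly without a gateway" behaviour is a small lookup helper like the sketch below. Only the OTARI_GATEWAY_CMD variable comes from this PR; the helper name and the exact lookup logic are assumptions, not the test harness's actual code.

```python
import os
import shlex
import shutil
from typing import Mapping, Optional


def gateway_command(env: Optional[Mapping[str, str]] = None) -> Optional[list[str]]:
    """Return the gateway argv if OTARI_GATEWAY_CMD names a runnable program.

    Hypothetical helper: returns None when the variable is unset or when its
    first token cannot be found on PATH, so callers can skip instead of fail.
    """
    source = os.environ if env is None else env
    raw = source.get("OTARI_GATEWAY_CMD", "").strip()
    if not raw:
        return None
    argv = shlex.split(raw)
    # shutil.which resolves bare names via PATH and accepts explicit paths.
    if not argv or shutil.which(argv[0]) is None:
        return None
    return argv
```

In a test module, a module-level pytestmark combining pytest.mark.integration with pytest.mark.skipif(gateway_command() is None, reason="no gateway available") would then skip the whole file when no gateway is configured.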

Part of mozilla-ai/otari#96

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Replace the OpenAI-SDK delegation with a hand-written shell over the
OpenAPI-generated core (otari._client), generated from the otari spec.
The gateway is only OpenAI-compatible and has endpoints (notably the
Anthropic-shaped /messages) the OpenAI SDK cannot represent; generating
from the otari spec models the real contract.

Core (unmodified generator output) replaces the old _control_plane
subpackage and covers every endpoint with typed request/response models.

Shell (the real work):
- Auth modes preserved: platform -> Authorization: Bearer <token>;
  non-platform -> Otari-Key: Bearer <key>; control-plane -> Authorization:
  Bearer <admin/master key>. Fed into the generated ApiClient default
  headers and the streaming shim's httpx requests.
- Ergonomic methods keep the existing public names/signatures
  (completion/response/embedding/moderation/rerank/list_models/batches)
  and add message(...) for the previously-missing /messages endpoint.
  control_plane accessor now backed by _client.
- SSE streaming shim (_streaming.py): the generated core buffers and
  cannot stream, so stream=True does a raw httpx streaming POST, parses
  text/event-stream framing, terminates on [DONE], and yields typed
  ChatCompletionChunk (chat) / parsed JSON (responses, messages).
- Typed error mapping: generated ApiException (.status/.body) -> the
  errors.py hierarchy (Authentication 401/403, InsufficientFunds 402,
  ModelNotFound 404, BatchNotComplete 409, RateLimit 429, GatewayTimeout
  504, UpstreamProvider 502/5xx, UnsupportedCapability cross-mode, generic
  OtariError). The streaming path adapts failed responses through the same
  mapper.
- Async client wraps the sync generated core via asyncio.to_thread and
  streams natively over httpx.AsyncClient.
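
The status-to-error mapping listed above can be sketched as a lookup table with a 5xx fallback. The classes here are stand-ins, not the real errors.py: AuthenticationError, UpstreamProviderError, and OtariError are named on this PR, while the other class names assume the same "Error" suffix. UnsupportedCapability is left out because it is described as cross-mode rather than status-driven.

```python
from typing import Optional


class OtariError(Exception):
    """Stand-in base class; carries the HTTP status and raw body."""

    def __init__(self, status: int, body: Optional[str] = None) -> None:
        super().__init__(f"HTTP {status}: {body or ''}")
        self.status = status
        self.body = body


class AuthenticationError(OtariError): ...
class InsufficientFundsError(OtariError): ...  # assumed name
class ModelNotFoundError(OtariError): ...  # assumed name
class BatchNotCompleteError(OtariError): ...  # assumed name
class RateLimitError(OtariError): ...  # assumed name
class GatewayTimeoutError(OtariError): ...  # assumed name
class UpstreamProviderError(OtariError): ...


_STATUS_TO_ERROR: dict[int, type[OtariError]] = {
    401: AuthenticationError,
    403: AuthenticationError,
    402: InsufficientFundsError,
    404: ModelNotFoundError,
    409: BatchNotCompleteError,
    429: RateLimitError,
    502: UpstreamProviderError,
    504: GatewayTimeoutError,  # exact match wins over the generic 5xx rule
}


def map_api_error(status: int, body: Optional[str] = None) -> OtariError:
    """Build the typed error for a failed response (illustrative only)."""
    cls = _STATUS_TO_ERROR.get(status)
    if cls is None:
        cls = UpstreamProviderError if 500 <= status <= 599 else OtariError
    return cls(status, body)
```

Routing both the generated ApiException (.status/.body) and failed streaming responses through one function like this keeps the two paths raising the same exception types.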

Drop the openai dependency from inference. Update ruff/mypy excludes to
src/otari/_client. Rewrite unit tests to mock the generated transport
(RESTClientObject.request) and respx for SSE; control-plane integration
tests updated to import from _client and verified green against a live
gateway.

Note: chat streaming cannot be live-verified in this sandbox (no provider
key on the gateway); the SSE shim is unit-tested over mocked chunk bytes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Fail CI when the gateway OpenAPI spec exposes an endpoint the SDK's public
API does not account for. A checked-in manifest (sdk-endpoints.txt) pins the
covered and intentionally-excluded endpoint sets; a pytest test fetches the
canonical spec and asserts spec is a subset of (covered + excluded), naming
any unaccounted endpoint. Wired as a dedicated CI step.
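
The subset assertion at the heart of the drift gate can be sketched as follows. The "METHOD /path" key format and the function name are assumptions; the actual layout of sdk-endpoints.txt and the test in this PR may differ.

```python
from typing import Any, Mapping

# Path items can also hold non-operation keys such as "parameters".
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def unaccounted_endpoints(
    spec: Mapping[str, Any],
    covered: set[str],
    excluded: set[str],
) -> list[str]:
    """List spec operations that are neither covered nor excluded.

    Operations are keyed as "METHOD /path" (an assumed manifest format).
    """
    spec_ops = {
        f"{method.upper()} {path}"
        for path, item in spec.get("paths", {}).items()
        for method in item
        if method.lower() in HTTP_METHODS
    }
    return sorted(spec_ops - (covered | excluded))
```

A test would then assert that the returned list is empty, so that the failure message names each endpoint the manifest does not account for.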

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

njbrake commented Jun 8, 2026

Member Author

Note: this comment was drafted by Claude via back-and-forth with @njbrake. The reasoning and decisions are his; the prose is Claude's.

Live smoke test (Python)

Ran this branch against a local standalone gateway (http://127.0.0.1:8000) wired to a real OpenAI provider, using a minted virtual inference key over Otari-Key auth (OtariClient(api_base, api_key=...)). Models: openai:gpt-4o-mini for chat/stream, openai:text-embedding-3-small for embeddings.

  • Non-streaming chat (completion): returned a typed ChatCompletion.
  • Sync streaming (completion(..., stream=True)): iterated the chunk iterator and received 10 content chunks live from real OpenAI, printed incrementally.
  • Async streaming (AsyncOtariClient, await completion(..., stream=True) then async for): 6 content chunks, live and incremental. Confirms the separate async shim path.
  • Embeddings: 1536-dimension vector.
  • Error mapping: a bad key raised the typed AuthenticationError; a priced-but-nonexistent model raised UpstreamProviderError (the gateway wraps unknown-provider-model failures as upstream errors).

All paths pass end to end against a real gateway and provider, including live multi-chunk streaming on both the sync and async clients.
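
For context on the async path: the commit notes say the async client wraps the sync generated core via asyncio.to_thread for non-streaming calls. A minimal sketch of that pattern, with a hypothetical blocking_call standing in for a generated sync method:

```python
import asyncio
import time


def blocking_call(x: int) -> int:
    """Stand-in for a synchronous generated-client method (hypothetical)."""
    time.sleep(0.05)  # simulates blocking network I/O
    return x * 2


async def acall(x: int) -> int:
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_call, x)


async def main() -> list[int]:
    # Both calls overlap, since each one blocks its own worker thread.
    return await asyncio.gather(acall(1), acall(2))
```

Streaming cannot reuse this wrapper, since a thread-wrapped call only returns once the whole response is buffered, which is consistent with the async client streaming natively over httpx.AsyncClient instead.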

njbrake changed the title from "Add generated control-plane client (preview)" to "Auto generated client (preview)" Jun 8, 2026
njbrake marked this pull request as ready for review June 8, 2026 12:39
njbrake changed the title from "Auto generated client (preview)" to "Auto generated client" Jun 8, 2026
njbrake merged commit c57cb5e into main Jun 8, 2026
3 checks passed
njbrake deleted the codegen/control-plane-preview branch June 8, 2026 14:33